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Abstract 

In this article, we propose several quantization based stratified sampling methods to reduce the 
variance of a Monte-Carlo simulation. 

Theoretical aspects of stratification lead to a strong link between the problem of optimal L^- 
quantization of a random variable and the variance reduction that can be achieved. We first em- 
phasize on the consistency of quantization for designing strata in stratified sampling methods in 
both finite dimensional and infinite dimensional frameworks. We show that this strata design has a 
uniform efficiency among the class of Lipschitz continuous functionals. 

Then a stratified sampling algorithm based on product functional quantization is proposed for 
path-dependent functionals of multi-factor diffusions. The method is also available for other Gaus- 
sian processes as the Brownian bridge or an Ornstein-Uhlenbeck process. We derive in detail the 
quantization of the Ornstein-Uhlenbeck process. 

The balance between the algorithmic complexity of the simulation and the variance reduction 
factor has also been studied. 
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Introduction 



The quantization of a random variable X consists of its approximation by a random variable Y taking 
finitely many values. This problem has been initially investigated for its applications to signal trans- 
mission and for compression issues. (See [H].) In this context, quantization was a method of signal 
discretization. The point of interest was to design the random variable Y in order to minimize the 
resulting error. This led to the concept of optimal quantization. 

More recently, quantization has been introduced in numerical probability to devise numerical inte- 
gration methods [21] and for solving multi-dimensional stochastic control problems such as American 
options pricing [T] and swing options pricing [2] . Optimal quantization has many other applications and 
extensions in various fields like automatic classification (quantization of empirical measures) and pattern 
recognition. 

Since the early 2000's, the infinite dimensional setting has been extensively investigated from both 
theoretical and numerical viewpoints with a special attention paid to functional quantization (the infinite 
dimensional case) [T71 [35]. Stochastic processes are viewed as random variables taking values in their 
path spaces such as := _L^([0, T], di). 

Still the Monte-Carlo simulation remains the most common numerical methods in the field of nu- 
merical probability. One reason is that it is easy to implement in an industrial configuration. In the 
industry of derivative, banks implement generic Monte-Carlo frameworks for pricing numerous payoffs 
with a wide variety of models. Another advantage is that the Monte-Carlo method can be parallelized. 

Variance reduction methods can be used to reduce dramatically the computation time of a Monte- 
Carlo simulation, or increase its accuracy. Main variance reduction methods are (adaptive) control 
variate, pre-conditioning, importance sampling and stratification il6j . The problem is that these 
methods may strongly depend on the payoff or the model and imply specific changes in the practical im- 
plementation of the Monte-Carlo method. Thus, most institutions do not implement the most advanced 
methods in practice except for marginal cases. 

In this paper, we point out theoretical aspects of quantization that lead to a strong link between 
the problem of optimal -quantization of a random variable and the variance reduction that can be 
achieved by stratification. We emphasize on the consistency of quantization for designing strata in 
stratified sampling methods in both finite dimensional and infinite dimensional frameworks. Then, we 
devise a stratified sampling algorithm based on product functional quantization for path-dependent 
functionals of multi-factor Brownian diffusions. We show that this strata design has a uniform efficiency 
among the class of Lipschitz continuous functionals of the Brownian motion. The simulation cost of 
the conditional path is 0{n) where n is the number of discretization dates, like for naive Monte-Carlo 
simulations. In this context, this stratification based variance reduction method can be considered as a 
guided Monte-Carlo simulation. (See figure [S)) The method extends to any Gaussian process as soon 
as its Karhunen-Loeve decomposition is explicitly known. So is the case for the Brownian bridge or the 
Ornstein-Uhlenbeck process. The special case of the Ornstein-Uhlenbeck process is derived in annex [Bl 

One very common situation is the case of Monte-Carlo implementations that are based on multi- 
factor Brownian diffusions approximated by their Euler scheme. The presented method is particulary 
adapted to this situation. Even in the multi-dimensional case, no matter how the independent Brownian 
motions are correlated or used afterwards ; no matter if it is used for diffusing the underlying stock, a 
stochastic volatility process or an actualization factor. Functional stratification can be used as a generic 
variance reduction method. The point is that it is used upstream in the Monte-Carlo framework. One 
does not need to re-implement the whole framework but only the way it is alimented with Brownian 
motions. Thus quantization-based functional stratification can come along on the top of a computation 
procedure. In the last section, numerical tests are provided with a benchmark with a Up-In-Call pricing 
in the Black and Scholes model. 

The paper is organized as follows. Section [1] presents the main results about optimal quantization 
that are required in the following. The emphasis is on the functional quantization of Gaussian processes. 
Section[2]presents the first historic quantization-based variance reduction method : using quantization as 
a control variate variable, as proposed in |221ll5j . Then section [3] outlines the links between quantization 
and stratification. The emphasis is on the Gaussian case. The method is specified in the functional 
case for Gaussian processes in section 2] We present a simulation method for the Brownian motion 
and other examples of Gaussian processes (as the Ornstein-Uhlenbeck process and the Brownian bridge) 
that preserves the 0{n) simulation complexity where n is the number of time steps. In section [S] we 
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provide numerical experiments of the method with option pricing problems arising in mathematical 
finance. Annex |X] provides the proof of a closed form expression for the Brownian motion used for 
functional stratification. Annex [B] presents the computation of the Karhunen-Loeve decomposition of 
the Orstein-Uhlenbeck process, and the related numerical methods. A pseudo-code for the decomposition 
computation is provided. 



1 Optimal quantization, the abstract framework 
1.1 Introduction to quantization of random variables 

In the following, (fi, A, P) is a probability space, and £' is a separable refiexive Banach space. The norm 
on E is denoted | • |. In the following, one will assume that the random variable are defined on {il,A,P). 
One denotes N* := {1, 2, • • • }. 

The principle of the quantization of a random variable X taking its values in E is to approximate X 
by a random variable Y taking a finite number N of values in E. The discrete random variable y is a 
quantizer of X. 

The resulting error of this discretization is the L^-norm oi X — Y . One wants to minimize this induced 
error. This gives the following minimization problem: 

min{||A: - Y\\p,Y : ^ E measurable, card{Y{n)) < N} . (1) 

Definition (Voronoi partition). Consider N G N* , T = {yi, ■ ■ ■ , y^} C E and let C = {Ci, • • • , Cat} be 
a Borel partition of E. C is a Voronoi partition associated with T ifyiG |1, iV|, Ci C G -E, |^ — = 
min |£ — Wilj. 

If C = {Ci, • • • , Cn} is a Voronoi partition associated with F — {yi, ■ ■ ■ ,yN}, it is clear that Vi G 
|1, Nl,yi G Ci. Ci is called Voronoi slab associated with yi in C and yi is the centre of the slab Ci. 
One denotes Ci — slahdyi), and for every a e F, M^(a|F) is the closed subset of E defined by iy(a|F) = 

y G E, \y — a| = min \y — b\ 

Definition (Nearest neighbour projection). Let us consider the settled point set F ~ {yi, ■ ■ ■ ,yN} C E 
and C = {Ci, • • • ,Cn} the associated Voronoi partition. The nearest neighbour projection on F is the 

N 

application Projp := ^ yild- 
1=1 



Proposition 1.1. Let X be an E-valued L^ random variable, and Y taking its values in the settled point 
set F = {yi, ■ ■ ■ ,?/jv} C E where iV G N. Set X^ the random variable defined by X^ := Projp(A') where 
Projp is a nearest neighbour projection on T, called a Voronoi T -quantizer of X . 
Then we clearly have \X - X^\ <\X -Y\ a.s.. Hence \\X - X^\\p < \\X - Y\\p. 

As a consequence of the previous remark, solving the minimization problem ^ amounts to solving 
the simpler minimization problem 

min{||A:-Projp(A:)||p, F C card(F) < TV} . (2) 

The quantity \\X — Projp(A')||p is called the mean L^-quantization error. When this minimum is 
reached, one refers to optimal quantization. 

The problem of the existence of a minimum have been investigated for decades on its numerical and 
theoretical aspects in the finite dimensional case [TU] . 

• For every A'^ > 1, the L^-quantization error is Lipschitz continuous and reaches a minimum. An A^- 
tuple that achieves the minimum has pairwise distinct components, as soon as card(supp(Px)) > N. 
This result stands in the general abstract case of a random variable valued in a reflexive Banach 
space. (This has been prooved in [17].) 

• If card(A(r2)) is infinite, this minimum strictly decreases to as A^ goes to infinity. The rate of 
convergence is ruled by theorem 11.21 in the finite dimensional case. 
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Theorem 1.2 (Zador). • Sharp rate (See JTU]) Let r > and X e L''+''(P) for some t] > 0. 
Let ¥x{d^) = (t>(0^^ + mC'^O ^6 the canonical decomposition of the distribution of X and the 
Lebesgue measure are singular). Then, (ifcj)^ 0), 

- + - 

eN,r{X,R'^) ^ Jr,d'x (^J (p^-{u)du^ X as N ^ oo, (3) 

where Jr^d G (0, oo). 

• NON ASYMPTOTIC UPPER BOUND (See [T^l) Let d> 1. There exists Cd,r,r) e]0,oo[ such that, for 
every -valued random vector X , 

VAT > 1, eAr,,(X,M'*) < Cd,r,jX\\r+-nN-i. (4) 

This mainly says us that min | ||X — X\\p, card(r) < ^ Cp^^p^dN~^ . The first statement of tlie 

theorem was first prooved for distributions with compact supports by Zador in ^7\. Then a first extension 
to general probability distributions on R'^ is developped in [S]- The first mathematically rigorous proof 
can be found in [TU]. The non asymptotic error bound of the second statement is prooved in [TS|. 

In figure [TJ the Voronoi partition of a random iV-quantizer and an L^-optimized N quantizer of the 
A/'(0,/2) distribution are given. 




Figure 1: Voronoi partition of a random quantizer and a L^-optimized A^-quantizer of the Af{0,l2) 
distribution in M^. {N = 20). 



1.2 Stationarity and centroidal Voronoi tessellations 

We now assume that £^ is a separable Hilbert space {H, (., .)h)- 

• Cn{X) is the set of L^-optimal quantizers of X of level N. 

• eisfiX) is the minimal quadratic distortion that can be achieved when approximating X by a 
quantizer of level N. 

Definition (Stationarity). A quantizer Y of X is stationary (or self-consistent) if 

Y = E[X\Y]. (5) 
Proposition 1.3 (Stationarity of L^-optimal quantizer). A (quadratic) optimal quantizer is stationary. 

The stationarity is a particularity of the quadratic case {p = 2). In other L^ cases, a similar property 
involving the notion of p-centre occurs. A proof of is available in 

A consequence is, iiY = Projp(X) is an L^-optimal quantizer, and C = {Ci, • • • , C„} is the associated 
Voronoi partition, one has Vy G T,y = K[X\X G s\ahc{y)]- 
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Proposition 1.4. Let X be an H -valued random variable. Let us denote the squared quadratic 
quantization error associated with a codebook of size N with respect to X. 



{xi,- ■ ■ ,xn) E 



min \X ~ Xi\jj 

l<i<N 



The distortion function is \.\h- differentiate at N -quantizers x G with pairwise distinct 

components and 

VD^ix) = 2( / (x,- O^xidO) ^ ^„ = 2(e(1i^(-) - . (6) 

^JCi(x) / l<i<N \ ^ ' / l<i<N 

Hence any Voronoi quantizer associated with a critical point of is a stationary quantizer. 



Definition (Centroidal projection). Let C — {Ci, • • • , Cn} be a Borel partition of H . Let us define ft 

1 < i < N, Gi — I ^^^^^ ^ ifV[X £ Cj] ^ 0, centroids associated with X and C. 

U m the other case. 



or 



N 

The centroidal projection associated C and X is the application Projp j^- : a; — > ^ Gilc^{x). 

1=1 

Lemma 1.5 (Huyghens, variance decomposition). Let X G L'^(P) be a H-valued random variable, 

N 

N Cz N* and C = {Ci)i<i<N o, Borel partition of H . Consider Proj,-; ^ = ^ Gild the associated cen- 

' i=l 

troidal projection. 
Then one has, 

Var(X) = E[\X - Projc:,x(X)p] + E[| Projc,;,(X) - E[X]\^ . 

: = (!) : = (2) 

The variance of the probability distribution X decomposes itself into the intraclass inertia (1) plus 
the interclass inertia (2). 

Proof of lemma: 

Var(X) = E[|X - Projc.x(^) + Projc,x(^) " E[X]|2] 

= nx ~ Projc,^(X)n +E[| Projc,x(^) " EWl'] 



=(2) 

+ 2E[(X - Projc.,^(X), Projc.,x(X) - E[X])] 



: = (3) 

Now (3) = since Projc^x(X) = E[X| Vm^cxiX)]- D 

1.3 Optimal quantization and principal component analysis 
1.3.1 Reduction of dimension 

Tire aim is now the reduction of the quantization problem to finite dimensional subspaces of H . For any 
finite dimensional subspace U of let liu denote the orthogonal projection from H onto U . 

Proposition 1.6. Let U be a finite dimensional linear subspace of H . Then 



eNiTluiX)f < CNiX)^ < inf E 



min IIX — alP 
aer 

E\\X^nu{X)r + eN{Uu{X))^ 



r C C/, 1< cardP < N 



In other words, the quadratic quantization error with respect to F C ?7 consists of the projection 
error and the quantization error of the projected random variable. Let us refer to [17J for a proof. 

Notation: Let d]s[{X) = min{dimspan(F), F G Cn{X)} denotes the quantization dimension of the level 
N of the quantization problem for X. 
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It follows from proposition II .61 that 

= {mix - n.(x,n + L'=JJ:sr>T.7x) } ■ 

1.3.2 Covariance operator of a random variable 

Definition. Let X be a centered H-valued random variable. 

The covariance operator Cx : H ^ H of X is defined by C'xy = ^[{y, X)X]. 

1. In the finite dimensional case, the matrix of Cx in the canonical basis is the covariance matrix of 
X. 

2. li X = {Xt)t£io,T] is a bi-measurable centered i^(P)-process with paths in L'^{[0,T], dt) a.s. and 
covariance fmrction Tx{s,t) := E[XsXf] satisfying Jj^ j,j Tx{s,t)ds < +oo. Then X can be seen as 
a L^([0, T], di)- valued random variable with E[||X||^] < oo. 

Cxy^ f y{s)Tx{s,-)ds, y£ L^[0,T],dt). (7) 

J[0,T] 

In [T7], it is prooved that linear subspaces U of H spanned by n-stationary codebooks of Gaussian 
measures correspond to principal components of X, in other words, are spanned by eigenvectors of Cx 
corresponding to the m largest eigenvalues. Thus these subspaces correspond to the first m principal 
components of X. 

Theorem 1.7. Let T be an optimal codebook for X, U = span(r) and m = dim [7. Then Cx{U) = U 
and K\\X — Ilu{X)\\'^ = ^ A^, where \^ > > • • • > are the ordered non-zero eigenvalues of 

j>m+l 

Cx (written as many times as their multiplicity). 

Xf = inf{E||X - Uv{X)\\'^\V C H linear subspace, dim = m}. 

j>m+l 

We now deduce the final representation of eN{X). 

eN{X)^= J2 ^^+eiv((g)AA(0,Af)) for m > (8) 

CNiX)^ < ^^+eiv((g)AA(0,Af)) for 1 < m < (9) 

j>m+l J — 1 

These two equations ([8]) and (|9]) show that for the quantization of a Gaussian process X, as soon 
as we know its Karhunen-Loeve basis (e^)„gN. and its eigenvalues (A^)„gN*, the problem of optimal 
i^-quantization comes to the problem of the quantization of a Gaussian vector of dimension dpf- 

1.4 Product quantization 

Let (e„)„gN* be a Hilbertian basis of H and / C N* is a non empty finite subset of N*. For every k £ L, 
consider a A^fc-tuple T^" = {xf ^ • • • ,x^l} C M. 

An easy way to construct a quantizer is to define the codebook F by the set of the points x such that 
for every fc g /, {x, Ck) e F^ and for every k S N*\/, {x, Ck) = E[{X, Ck)]- 

The Voronoi cells associated with such a codebook are hyper-parallelepipeds. 

Proposition 1.8 (Case of independent marginals). With the same notations, if one assumes that the 
marginals of X , {{X, ei) , {X, 62) , ■ ■ ■ ) are independent, then one can choose for each k £ L the values 
F'^ = {x^'' , ■ ■ ■ jX^^} such that — Projp/o ((X, e^)) is a stationary quantizer of {X,ek). Then Y = 
Projp(X) is a stationary quantizer of X . 

This method yields a stationary quantizer with a simple projection rule. 

A drawback of product quantization is that one needs to restrict to the case of independent marginals 
in order to preserve stationarity. 
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1.5 Numerical optimal quantization 



Various numerical algorithms have been developed to obtain numerically an optimal iV-grid with a 
minimal quadratic quantization error in the finite-dimensional setting. A review of these methods is 
available in [24j . Let us mention the Lloyd's algorithm for the quadratic case, which is the natural 
probabilistic counterpart of a classification algorithm due to Forgy [7]. 

Another algorithm is a stochastic gradient method which is suggested by the fact that the L^- 
quantization distortion function is differentiable at any A^-tuple having pairwise distinct components 
and a Px-negligible Voronoi tessellation boundary and has an integral representation. The algorithm is 
deeply investigated in [2Tj . 

Equation ^ shows that any Voronoi quantizer associated with a critical point of is a stationary 
quantizer. In the case of one dimensional distributions, as the Gaussian distribution, the Hessian of the 
distortion is known and can be represented by a tridiagonal matrix. Hence, it is easy to invert and a 
Newton-Raphson method can be implemented. It is completely detailed in [21 1 in the Gaussian case. It 
remains the fastest way to compute L^-optimal quantizers of one-dimensional Gaussian variables. 



1.6 Quantization of Gaussian processes 
1.6.1 Quantization 

From now on, we will assume that X is a bi-measurable Gaussian process defined on the probability 

T 

space {n,A,V) satisfying E[|A:|22] = /E[X2]ds < oo,. 

We have seen in section [T751 that in this context, as soon as one knows the Karhunen-Loeve system 
(ejf , A^)„gN* of the covariance operator of X, the problem of the L^-optimal quantization of the process 

m 

X comes to the quantization of the Gaussian vector A/'(0, A^^). The companion parameters of the 

m 

functional quantizer are easily deduced from the quantizer of A/'(0, A^) that is used. 

All this is valid for any Gaussian process X, except that one needs to know its Karhunen-Loeve 
basis. Several usual Gaussian processes have explicit Karhunen-Loeve expansions, like the Brownian 
motion and the Brownian bridge. The Ornstein-Uhlenbeck process admits a semi-closed form for its 
Karhunen-Loeve expansion. (The formula is derived for normalized parameters in the stationary case in 
P^2J, p. 195.) In section|Bl the computation of Karhunen-Loeve decomposition of the Ornstein-Uhlenbeck 

process is detailed in the general Gaussian case (rg ^ Af{mo, <Jq)). As far as we know, the K-L expansion 
of the fractional Brownian motion is not known. 

Further in the paper, numerical illustrations will be given for the following cases. 

1. The Brownian motion {Wt)t£[o.T]'- 

er(i) sin (^(n- 1/2)1), := { ^(^^^ y^) )' ' " " ^^^^ 

2. The Brownian bridge on [0,T]: 

en (t) ■- sin (nn-^y := (;^) , n > 1. (11) 

3. The Ornstein-Uhlenbeck process on [0, T], starting from 0, and defined by the SDE 

dn = -Ortdt + odWt : (12) 

e°^(f) := { ^ I sin(c.,„t), := , , n > 1, (13) 



2 4'^A„ / 

where (a;A„)n>i are the strictly positive solutions of the equation 

6'sin(a;A„T) + ujx„ cos{ujx^T) = 0, 

sorted in an increasing order. (Based on results from section IbI) 
4. The stationary Ornstein-Uhlenbeck process on [0,r]. (See section [Bl) 
On figure [5J one can see an A'^-optimal L^-quantizer of the standard Brownian motion. 
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Figure 2: Optimal quantizer of a standard Brownian motion on [0, 1]. 



1.6.2 Product quantization 

Thanks to equations ([5]) and ©, product quantization of the finite dimensional Gaussian vector ^ ^ 

m 

A/'(0, A^) yields a stationary quantizer of the process X. In this context, let us introduce the following 
notations: 

The quantizer oi X \s X — ^ \/ ^n ^n^n ^ where ^„ is an optimal iV„-quantizer of ^„ and Ni x • • • x 

n>l 

-^Ti < N, Ni, • • • , A^„ > 1. (Hence for large enough n, Nn = 1 so that ^„ = 0.) 

The paths of an iVi x • • • x A^„-quantizer x ^nd a multi-index i — {ii, ■ ■ ■ , i„, • • • } that produces this 
quantization are of the form 

xi-E\A?<"^"- (14) 

n>l 

A quantizer x defined by equation (|14p is called a K-L product quantizer. Furthermore, one denotes 
by Opq{X, N) the set of the K-L product quantizers of size at most of X. 
In the case of a product quantization, the counterpart of equation ^ is 



E[min \X - XiP] = E A^E [ ^ min ^ |e„ - x^f | + E A 



dec 



„=1 Ll<i„<N„ 



n=l 



. 1<!„<JV, 



(15) 



n=l 



1.6.3 Product decomposition blind optimization 

As a consequence, the lowest quadratic quantization error induced by a K-L-product quantizer having 
at most N codebooks is obtained as a solution of the minimization problem 



.[e{x),X^Op,{X,N)Y (16) 



that is, thanks to equation (115 

d 



min{^A^min||^-e^^")||2+ ^ A^, Ai x • • • x A^„ < A^, d > l}. (17) 

n— 1 n>d+l 

A solution of (|16p is called an optimal K-L product quantizer. 
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The blind optimization procedure consists of computing the criterium for every possible decomposition 
A^i X • • • X Nn < N. For a given Gaussian process X, results can be kept off-line for a future use. 
Optimal decompositions for a wide range of values of N for both Brownian bridge and Brownian motion 
are available on the web site www. quantize .maths -fi . com |23| for download. The blind optimization 
procedure is more thoroughly described in [2 2 . Let us remind that the optimal decomposition depends 
on the parameters of the Orstein-Uhlenbeck process {a and 9 in equation (jl2p ') and the maturity. 

Some values of optimal decompositions for the stationary Ornstcin-Uhlenbeck process are given in 
table El 



N 


Nrec 


Squared L'^ Quantization Error 


Nrec decomposition 


1 


1 


1.5 


1 


10 


10 


0.65318 


5 - 2 


100 


96 


0.40929 


6-4-2-2 


1000 


960 


0.29618 


10-6-4-2-2 


10000 


9984 


0.23150 


13-8-4-3-2-2-2 



Figure 3: Record of optimal product decomposition values of the stationary centered Ornstein-Uhlenbeck 
process given by drt — —Ortdt + adWt on [0, T] with 6* = 1, cr = 1 and T = 3. 



Proceeding in this article, we will be confronted with other similar optimization problems (with 
another criterium than the quadratic distortion). The blind optimization procedure will be the way to 
compute optimal product decomposition databases. 

In figure SI one can see examples of optimal product quantizers of the Brownian motion and the 
Brownian bridge on [0, 1]. In figure SI one can see optimal product quantizers of the centered Ornstein- 
Uhlenbeck process starting from rg = and a stationary Ornstein-Uhlenbeck on [0, 3]. 




Figure 4: Optimal product quantizer of a standard Brownian motion (left) and a standard Brownian 
bridge (right) on [0, 1]. 



1.6.4 Rate of decay for the quantization error 

In [T7J, a precise link between the rate problem and Shannon- Kolmogorov's entropy of X is estab- 
lished. This allowed them to compute the exact rate of convergence of the minimal L^-quantization 
error under rather general conditions on the eigenvalues of the covariance operator. Typical rates are 
0(log(n)~''), a > 0. This conditions are fulfilled by a large class of processes as the Ornstein-Uhlenbeck 
process and the Brownian motion. 
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Figure 5: Optimal product quantizer of a centered Ornstein-Uhlenbeck process, starting from tq = 
(left) and stationary (right) given by drt — —rtdt + dWt, on [0, 3]. 



2 Quantization as a control variate: a first attempt to quantiza- 
tion based variance reduction 

This method has been initially proposed in [22j. 

2.1 Quantization as a control variate variable 

Let X : {il, P) — > i? be a square integrable random variable, condider N E N* and let F = {yi, • • • , i/n} 

N 

be an 7V-codebook. We suppose that we have access to a F- valued quantizer Y — Proj(X) — ^ yilc\{x) 

i=l 

where C = {Ci, • • • , Cat} is a partition of E. At this step, we do not need Proj to be a nearest neighbour 
projection on F. 

Let F : E ^ E he a. Lipschitz continuous function such that F{X) G _L^(P). In order to compute 
E[i^(X)], one writes: 



i[F{X)] = E[F(Proj(X))] + E[F(X) - F(Proj(X))] 



E 



M 

[F(Proj(X))] + - ^ F(X(™)) - F(Proj(X(™))) +R 



N,M, 



(18) 



(a) 



m— 1 



(6) 



where I < m < M are M independent copies of X, and Rn,ai is a remainder term defined by 

equation p^ . 

Here, term (a) can be computed by quantization and term (6) can be computed by a Monte-Carlo 
simulation. Now 

11^^ ^11 ^ a(F(X)-F(Proj(X))) ^ ||F(X)~F(Proj(X))||2 



Lip 



/M 

||X-Proj(X)||2 



Furthermore, VMRnm AA(o, Var (i^(X) - F(Proj(X))^ ) . 

Consequently, in the d-dimensional case, if F is simply a Lipschitz function and if (Y/v)7veN 
(Proj^(X))Ar£N is a rate optimal sequence of quantizers of X, 



\\F{X)-FiProf{X))h<[F]up 



Cx 



and 



\Rn,m\\2 < [F] Lip 



Cx 



7\//l/2^1/d- 
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Likewise, in the case of the Brownian motion, if (W^) n>i is a rate optimal sequence of product 
quantization of tlie Brownian motion, if F is simply a Lipschitz functional, then 



\\F{W)-F{W^)h < [F]l^ 

and 



JVmi_ ^ FPU. 

'^l0g(7V)l/2 



I^w.mIlS, II2 < [F]Lip- ^ 



'Mlog(iV)i/2- 

2.2 Practical implementation: the problem of fast nearest neighbour search 

• The complexity of the projection: Concerning practical implementation, one notices in equation 

that for every step of the Monte-Carlo method, one has to compute the projection Proj(X*^"')). This 
is the critical part of the algorithm when dealing with optimal quantization. Hence, the efficiency of the 
quantization as a control variate variable is conditioned by the efficiency of the projection procedure. 
When dealing with Voronoi quantization, this is the nearest neighbour projection. 

The problem of nearest neighbour projection, also known as the post office problem [14 1, has been 
widely investigated in the area of computational geometry. It is encountered for many applications, as 
pattern recognition and information retrieval. 

The problem has been solved near optimally for the case of low dimensions. Algorithms differ on 
their practical efficiency on real data sets. For large dimensions, most solutions have a complexity that is 
exponential with the dimension, or require a bigger query time than the obvious brute force algorithm. 
In fact for dimension d > log TV, a brute force algorithm is usually the best choice. This effect is known 
as the curse of dimensionality. Still, even in low dimension, fast nearest neighbour search is a critical 
part of the algorithm. Let us refer to [55] for a review about fast nearest neighbour search algorithms. 

Concerning vector quantization, the speed of the projection can also be increased by relaxing the 
hypothesis within the projection on the quantizer is a nearest neighbour projection. It can be done by 
designing other kind of partitions of the state space. 

• The functional case: One other drawback of the method, when dealing with the functional case 
is that one does not simulate the whole trajectory of the stochastic process but only its marginals at 
discretes dates. Hence it is not possible to compute its projection. This problem finds its solution in the 
simulation scheme for Gaussian processes derived in section 14.21 for the functional stratification. 

A variance reduction technique using a functional quantizer of the Brownian motion as a control 
variate has been proposed in [15] . 



3 Application of quantization to stratification 
3.1 A short background on stratification 

The base idea of stratification is to localize the Monte-Carlo method on the element of a measurable 
partition of the state space of a random variable X : (il. A) — ?> (E, e). 

• Let {Ai)i^i be a finite e-measurable partition of E. The sets Ai are called strata. Assume that the 
weights Pi = ¥{X G Ai) are known for i G / and strictly positive. 

• Let us define the collection of independent random variables {Xi)i^i with distribution £{X\X G 

Remark: One assumes that one can write Xi = (j)i{U) where U is uniformly distributed on [0, l]*"' 
and (p : [0, l]*"' M is an easily computable function. (One has G N U {-l-oo}, the case = +00 
occurs for example in the case of the acceptance-rejection method.) This condition simply means that 

the random variables Xi ^ £{X\X G Ai) are easy to simulate on a computer. 

It is a major constraint for practical implementation of stratification methods. This simulability 
condition usually has a strong impact on the possible design of the strata. In the following, one will 
come back several times on this condition. 
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Let F : {E,e) (R,e(R)) such that E[F2(X)] < +00. 

The stratification concept comes into play now. Let M be the global budget allocated to the com- 
putation of E[F(X)] and Mi = qiM the budget allocated to compute E[F(Xi)] in each stratus. One 
assumes that E 9i = 1- This leads to define the (unbiased) estimator of E[F(X)]: 

, Mi 

mM---T.p^j^T.^(^'^^ (19) 

iei * k=i 

where {X^)i<:k<Mi is a C{X\X G AJ-distributed random sample. 
Proposition 3.1. With the same notations: 



Ye.r{FiX)i,) = ^J2^ah, (20) 

iei * 

where crl - = Y£it{F{X)\X G A,) = Var(F(X,)) G /. 



Mi 

Proof: Let us denote = E ^i^i)- ^I'e independent. 

' k=l 

One has F{X)j^j — E Pi'Z^i- Hence, by independence, 

Var(FWi/) - EP?Var(Z,) = 1 ^^^^^^^^^^ = ^ ^ 



ie/ ie/ * iei ^ 

□ 

Optimizing the simulation allocation to each stratus amounts to solving the following minimization 
problem: 

2 

min V^4z where Vi = {{q^)^eI G Vg. - l|. (21) 

3.1.1 Sub-optimal choice 

The first natural choice is to set 

q^=p,, tel. (22) 

The two motivations for this choice are the facts that the weights pi are known and because it always 
reduces the variance. 

E f 4.. = Ep.4,. - E^[{f{x)-e[f(x)\x g a,])\a^{x) 

= fkx) - E[FiX)W{{X G A,}, r G /)]||i 
<||F(X)-E[F(X)]||i=Var(F(X)). 

3.1.2 Optimal choice 

The optimal choice is the solution of the constrained minimization problem (j2ip . The Schwartz inequality 
yields 



iei i£i iei ^* iei 

=1 
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As a consequence, the solution of the minimization problem corresponds to the equality case into the 
Schwartz inequality. Hence the solution of the minimization problem is given by 



(23) 



and the corresponding minimal variance is given by I 'Y^ PiCrp, 

At this point, the problem is that one does not know the local inertia cr'^p^. Still, using the fact that 
LP norms are decreasing with p, one sees that 



(JFi > E 



\F{X) ~ nF{X)\{X e A,}] I {X e A,} 



so that 



> 



F{X) -E[F{X)\<j{{X e A,}, I e I)] 



In , Etore and Jourdain proposed an algorithm for adaptively modifying the proportion of further 
drawings in each stratum, that converges to the optimal allocation. This can be used in a general 
framework. 

In section 1321 we will see that the problem of designing good strata, in term of variance reduction is 
linked with the problem of optimal quantization. Moreover, the case of quantization based strata have 
two other advantages: 

• The weights pi are already known, which saves us from evaluating their values during the Monte- 
Carlo evaluation. 

• As concerns the optimal choice for the allocation parameters qi , one shows in theorem 13.21 that 
weights can be chosen such that stratification has a uniform efficiency among the class of Lipschitz 
continuous functionals. This weights have an explicit expression in the case of quantization based 
stratification. 



3.2 Stratification and quantization 

The main drawback induced by using quantization as a control variate variable is that it requires re- 
peated computations of projections on the quantizer. (Nearest neighbour search in the case of a Voronoi 
quantizer.) The point when dealing with stratification is that one does not have to use a projection 
procedure. 

The critical point now is the cost of the simulation of conditional distributions C{X\X S Ai), i g /. 

Theorem l3.2l brings together previous results about stratification and highlights the relationships with 
the notions of local inertia and intraclass inertia. It stresses the fact that stratification has a uniform 
efficiency among the class of Lipschitz continuous functionals. 

Theorem 3.2 (Universal stratification). Let A — {Ai)i^j be a partition (stratification) of E. (Keep in 
mind the notation Proj^ ^ for the centroidal projection of the distribution Z on a partition A, defined 
in definition M.i 



1. For every i L , consider the local inertia of the random variable X , 



at =E \X-W,X\X G Af\\ 

Then, for every Lipschitz continuous function F : E M 
Vi e /, aF,i < [FjLip^i so that 



X eA, 



sup (TF,i < cn 



(24) 
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2. In the case of the sub-optimal choice (see section \8.1.1]) . 



sup ( EPjCTf j) < EPjCrf 



X -E[X\ai{X e A,},ie I)] 



(25) 



3. In the case of the optimal choice (see section \3.1.2\) . 

sup [^Pi(^i',i) < (y]p< 

and 



(26) 



(E 



iei 



> 



X -E[X\ai{X G A,},ie I)] 



X-Prou.xiX) 



4- If one considers vector-valued Lipschitz continuous functions F : E ^ E, then inequalities 
\2d^) and i26]) hold as equalities. 

Proof: One has 

= Var(F(X)|Xe A,) 

= e[\f{x)-e[f{x)\x e A,]\^\x e A,] 
< E[\F{X) ~ F{E[X\X e A,])f |X e A^]. 

Now using that F is Lipschitz, it foUows that 

4,. < [F]hp-E\\X-E[X\X e A,]\h^xeA,^] = [F']upal 

Pi ^ J 

Items [5] and [3] easily follow from item[TJ Claim 5] is obvious by considering F — Ids- □ 
The general case: The idea is now to use the partition {^i, • • • ,An} and the A^-codebook F — 

N 

{yi, ■ ■ ■ tVn} associated with the projection Proj(a;) — ^ yilAi{x). 

In the case of a Voronoi quantization, this amounts to setting / — {1, • • • , A^} and Ai = slab/i(a;i). 
Then for every i € {1, • • • , iV}, there exists a Borel function ^(xi, .) : [0, l]"* E such that <l)(xi, U) ^ 
C{X\X € a) = ^rixecf ^ ^^ere U ^ U{[0, 1]'). 

Now let (^, U) be a couple of independent random variables such that f has the distribution of 

Y = Proj(X) and U ~ 14{[0,1]'^). Then one checks that 0(^, [/) has the same distribution as X, so 
that one may assume without loss of generality that X = (/)(Proj(X), U) and which in turn implies that 
e = Proj(X) i.e. 

X = (?!)(Proj(X), U), where U - U{[0, If) is independent of Proj(X). 

In terms of implementation as mentioned above, one needs a simple form for the function cj) (in term 
of computational complexity) which induces some stringent constraints on the choice of the strata. 

3.3 Simulability for hyper-rectangles strata in the independent Gaussian 
case. 

Consider a random variable X ^ A/'(0, Id), d > 1. Let (ei, • • • , e^) be an orthonormal basis of i? = M''. 
We set iVi, • • • , Nd > 1 the number of strata in each direction. So we consider for 1 < i < d, —oo = Xq < 
x\ < ■ ■ ■ < x]^, — +00. The strata are 

d d 

A,_ = Yl {x e R'' snchtha^t {ei,x} e [yi_i,yi]}, leni^'"' 



1=1 



1=1 
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d 

Then for every multi- index Till'''' 

1=1 

d 

£{X\X e A,) = C{Z\Z e [4_i,4]), where Z ~ AA(0, 1). 
1=1 

d 

Then pi = P{Ai) = J] {-^i^ik ) - )) and for -oo < a < 6 < oo, 

k=l 

C{Z\Z e[a,h])=M-^({U{b)^M{a))U +U{a)), U ^U{%1]). (27) 

4 Functional stratification of a Gaussian process 

In the functional case, the state space of the random values are functional spaces. What is usually done 
is to simulate a scheme to approximate marginals of the underlying process. 

In this section, we assume that X is an M- valued Gaussian process on [0,r]. We are interested in 
the value of &^F{Xta , Xt^ , • • • , ^t„ )] where = to < ti < • • • < i„ = T are n + 1 dates of interest for the 
underlying process. 

(For example, X can be a standard Brownian motion on [0,T], and V the risk-neutral expectation 
of a path-dependent payoff of a diffusion based on X .) 

What is done in this section can be easily generalized to multi-dimensional processes in the case where 
their coordinates are independent. (For example, when dealing with multi-factor Brownian diffusions, it 
does not matter how the Brownian motions are being correlated afterward.) Still we restrict ourselves 
to the one dimensional setting for clarity. 

Let us assume that x G Opq{X, N) is a K-L product quantizer of X. The codebook associated with 
this product quantizer is the set of the paths of the form 

n>l 

with the same notations as in section [1.6.21 

We now need to be able to simulate the conditional distribution 

CiX\X e A,) 

where Ai is the slab associated with Xi in the codebook. 

To simulate the conditional distribution C{X\X £ Ai) , one will : 

• First, simulate the first K-L coordinates of X, using (P7)) . 

• Then simulate the conditional distribution of the marginals of the Gaussian process, its first coor- 
dinates being settled. 



4.1 Simulation of marginals of the Gaussian process, given its d first K-L 
coordinates. 

In this setting, the aim is to simulate the conditional distribution 

c(Xt„--- ,Xt„\J^ Xse^ds,J^ Xse^{s)ds,--- , X,ef(s)ds) (28) 

where (Xt)(g[0,T] is a R-valued Gaussian process, and {e^ ,X^)keN* is the Karhunen-Loeve system 
associated with the process X. 

As X is a Gaussian process, (^Xt„, • • • , Xt^, J^^ X^e^ {s)ds, ■ ■ ■ , XsC-^ {s)ds^ is a Gaussian vector. 

J^Xse^is)d.s 

Hence, if we denote F := | : | and y := | : | , the conditional distribution ([28)1 is 

J^X,e^{s)ds 





( \ 


and V := 
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IS an 



given by the transition kernel A) = M cov("V^ — E["V^|y])) , where Afy^y 

affine function corresponding to the Unear regression of V on Y, Afy^yiY) := E[V\Y]. 

• The conditional expectation writes ^/y|y (F) — E[V] + Rv\yY where Rv\y = cov(V, Y) cov(F)~"'^. 
As cov(F) = ( (Xf5ij)) , and coy{V,Y) = {{X^ {ti)))i<k<d,o<i<n, one has 

V \ / / l<i,j<d 



Rv\Y 

The covariance matrix is 

K :=coy{V-E[V\Y]) 



efiU))) 



0<i<n,l<j<d 



= E[{V - Rv\YY)iV ~ Rv\yY) 

= cov{V) - 2 cov{V, RviyY) + coy{Rv\Yy) 
= coy{V) - coy{Rv\yY) 



= (J^coy{Vi,Vk)-J2\ief{ti)ef{tk)yj 



0<k,l<n 



(29) 



Now, we are able to simulate according to this probability distribution. 

The easiest way of doing this in the definite positive case is to compute the Cholesky factorization of 
the matrix K, but in this case, the simulation of a simple path requires an n x n matrix multiplication, 
which complexity is quadratic. This solution is not satisfactory for our purpose. 



4.2 Faster simulation of conditional paths - Bayesian simulation 

As pointed out above, the natural method to simulate £(y|F) requires for each path a multiplication 
by a Cholesky transform of K whose cost is O(n^). This cost is to high. 

• Yet, in the context of this paper, d is the quantization dimension of the process. It is close to log(A'') 
if N is the number of strata, and n, the number of time steps, is usually very large compared to d. 

• Moreover, we make the assumption that the cost of the simulation of {Xt^, ■ ■ ■ ,^t„) is 0{n). (So 
is the case for the Brownian motion, the Ornstein-Uhlenbeck process or the Brownian bridge for 
example.) 

• The idea here is that the conditional distribution £(V^|y) is determined through the Bayes lemma, 
by the conditional distribution £(y|y) and the two marginal distributions jC{V) and jC{Y). 

One knows that V = E[V\Y] + Z where Z ~ A/'(0, cov(F - IE[y|y])) is independent of Y. Hence 
one is able to simulate according to £(y|y = y) if one can simulate the distribution of Z, writing 
CiV\Y^y)^E[V\Y = y]+C{Z). 

This decomposition corresponds to the splitting of the Karhunen-Loeve expansion: 




~ =Yk V ^fc 



E 

l>d+l 



ef(in) 



=z 



=S[V\Y] 

To simulate Z, one simulates the distribution of V and the conditional distribution C{Z\V). 

One has C{Z\V) ^ V - C{E[V\Y]\V) ^ V - Afv\YC{Y\V) 
^V- Afy^YmnY\V],coviY - E[Y\V])). 

If AfY\v is the affine function corresponding to the regression of y on V and Ry\v its linear part, 

cov(F - E[Y\V]) = cov(r) + cov(E[r|F]) - 2cov(y,E[y|F]) 
= cov(y) - RY\vCOY{VyRY\V 
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This yields Z = V - Ajy\Y(G) where G - A/'(A/y|y (F), cov(r) - i^y,,/ cov(y)*i?y|y). 
Finally, the algorithm writes: 



• Simulate Y . 


('cosi 0/ 0(n).j 


• Simulate G ~ A/'(A/y|y (F), COv(r) - i?y|y COv(F)*i?y|y) 






fcosf o/0(d X d)}. 


• Compute Z = Y - A/y|y(G). 


(cost of 0{d X n) ). 


• The random variable T = A/y|y(y) + Z satisfies T - = y). 





Let us remind the fact that the afhne function Afyyf is trivially defined in equation ((29| . because 
coordinates of Y are independent. Other matrices implied in this algorithm are computed prior to any 
Monte-Carlo simulation. 



In the general case, the matrix Ry\v needed by the method can be computed by performing a 
numerical least square regression. 

Still, in the case of the Brownian motion, there is a closed form for the matrix Ry\v- If — — i^i 
< J < n, this yields Ry^y = ((a^ ))i<i<d,o<j<n, with 

. for J i {0,n}, a,, = yW 2^ i^^-^ i^^^-^ it..^ ^ 

The proof is available in section [X] The case of a non uniform subdivision is handled as well. 

Now, we have a very fast and easy way to simulate the conditional distribution H28p at our disposal. 

In figures |6] and [71 we plot a few paths of the conditional distribution of various Gaussian processes 
knowing that they belong to a given I? Voronoi cell. The appearance of the drawing suggests to consider 
the method as a "guided Monte-Carlo simulation". 

4.3 Blind optimization procedures 

We have seen in section [3.21 that the quantity = ( Vi'^i ) is an upper bound of the variance of 

the estimator, given in equation (I19p in the case where the functional is 1-Lipschitz continuous. Hence one 
may want to minimize this criteriuni instead of the i^-quantization error. This yields the minimization 
problem 

niin{d(x),x€Op,(X,7V)} (30) 

instead of the minimization problem (jl6p . 

The same kind of blind optimization procedure as in section [1.6. 31 can be performed. Some values of 
the optimal decomposition for the standard Brownian motion are given in table [51 

Optimal product decompositions for both Brownian bridge and Brownian motion and for a wide 
range of values of N are available on the web site www. quantize .maths-fi . com [23J for download. 
When comparing all the decompositions obtained for a quantizer size smaller than 11000, one notices 
that in the case of the Brownian motion, the optimal decompositions for both criteria are "almost" 
always the same. The only values where decompositions differ are the ranges 270 — 271 and 3328 — 3359. 
The two criteria do not have very different values for the two decompositions. Therefore, in practice, 
one can use the same decomposition database for the two applications. 

Nonetheless, in the case of the Brownian bridge and the Ornstein-Uhlenbeck process, one notices that 
the optimal decompositions for the variance and the optimal decomposition for the L^-distortion differ 
more often. 
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Figure 6: Plot of a few paths of the conditional distribution of the Brownian motion, knowing that its 
path belong to the Voronoi cell of the highlighted curve in the quantizer. 
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Plot of a few paths of the conditional distribution of the Brownian bridge (left) and the 
Ornstein-Uhlenbeck process (right), knowing that its path belong to the Voronoi cell of 
;hted curve in the quantizer. 
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Figure 8: Record of optimal product decomposition record values of the standard Brownian motion with 
respect to the criterium ([50)) . 
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4.4 Functional stratification of diffusions 

When dealing with diffusions with respect with a (multi-dimensional) Brownian motion, most Monte- 
Carlo methods imply a discretization scheme as the Euler scheme [S]. In this case, one replaces the 
Brownian motion by a stratified Brownian motion in the Euler scheme. This method is justified in many 
aspects: 

1. Let us consider the diffusion 

dXt^b{t,Xt)dt + a{t,Xt)dWt, Xq = xq, te[0,T], where (31) 

• CT e C^{[0,T] X R,K) positive and bounded, 

• y{t,x) e [0,T] X K, \b(t,x)\ < C(l + \x\). 

In the one dimensional setting, as soon as the drift of the Lamperti transform of the SDE ([51]) 
is Lipschitz continuous, it is prooved in [TH] that the unique strong solution of (|5T|). seen as a 
functional of the underlying Brownian motion is || • ||p-Lipschitz continuous. 

Hence one stands in the case of a Lipschitz continuous functional where one can use the results of 
section [3.21 about universal stratification. 

2. The function {Wto, • • • , Wt^) (Wt^ — Wtg, • • • , Wt„ — Wt„_i) that maps the marginals of the 
Brownian motion to the corresponding Euler scheme is linear from IR"+^ to M" and thus Lipschitz 
continuous as well. 

In next section ([S]), numerical examples are given when replacing the Brownian motion by a stratified 
Brownian motion in the Euler scheme. 

5 Application to option pricing 

Now, we are able to simulate the conditional distribution of a Gaussian process, given one of its Voronoi 
cell in a product quantizer. One condition is to know an orthonormal Hilbert basis that diagonalizes its 
covariance operator. The cases of the Brownian motion, the Brownian bridge and the Ornstein-Uhlenbeck 
process have been handled. 

The particular case of the Brownian motion allows to use functional stratification as a generic variance 
reduction method for the case of functionals of Brownian diffusions. Even in the multi-dimensional case, 
no matter how the independent Brownian motions are correlated or used afterwards ; no matter if it is 
used for diffusing the underlying stock, a stochastic volatility process or an actualization factor. It can 
be used as a variance reduction method. 

Hence, this is a very interesting variance reduction method to be used in an industrial way, indepen- 
dently of the path-dependent payoff or the model (as soon as it uses Brownian diffusions or one of the 
other proposed Gaussian processes). Users do not have to set up complicated adjustments when using 
it. 

In the following of this section, the method is used to illustrate its performance on simple one 
dimensional cases. One begins with the case of a continuous time Up-In Call in the Black and Scholes 
model, for which a closed formula is known, and used as a Benchmark. 

5.1 Benchmark with an Up-In-Call pricing in the Black and Scholes model. 

Here, one benchmarks the numerical method for a path dependent option in a case where a theoretical 
value is known : a barrier option in the Black and Scholes Model. 

For the sake of simplicity, consider a log-normal Black and Scholes diffusion with no drift (no interest 
rate and no dividend). 

One has a closed form for the continuous barrier option. A numerical correction proposed by Broadie 
and Glasserman [4] is done to get the closed-form price to be compared to. The number of Monte-Carlo 
simulations is 100000 in every case. 



19 



One prices an Up-In-Call with different values of the initial spot S, the strike K, the barrier H, the 
volatility cr, the maturity T, and the number of fixing dates for the discrete barrier n. In every case, a 
95% confidence interval is given. So is the variance of the estimator. 

The numerical results are reported in table ^ when using the method with 20 stratas and table [TUl 
when using the method with 100 stratas. In this tables, the first column correspond to the Broadie and 
Glasserman's closed form proxy. The second one corresponds to a simple Monte-Carlo estimator. The 
last three columns correspond to a stratified sampling estimator with different simulation allocation for 
each strata. 

The "sub-optimal weights" column stands for the allocation budget of equation (22). The "Lip.- 
optimal weights" column stand for the "universal stratification" budget allocation proposed in theorem 
13.21 Both these two case have explicit allocation rules. Last column, "Optimal weights" corresponds to 
an estimation of the optimal budget allocation given in expression (|23p . 



Parameters 


Broadie & 
Glasserman's 
proxy 


Simple 
Estimator 


Strat. Estimator 
sub-optimal weights 


Strat. Estimator 
Lip. -optimal weights 


Strat. Estimator 
Optimal weights 


S = 100, K = 100 
H = 125, cr = 0.3, 
T = 1.5, n = 365 


13.9597 


14.0379 
[13.8705, 14.2053] 
Var = 729.2518 


13.9281 
[13.8491, 14.0071] 
Var = 162.4650 


13.9283 
[13.8519, 14.0047] 
Var = 151.9481 


13.9364 
[13.8827, 13.9901] 
Var = 75.1319 


S = 100, K = 100 
H = 200, cr = 0.3, 
T = 1, n = 365 


1.3665 


1.4206 
[1.3442, 1.4969] 
Var = 151.6366 


1.3659 
[1.3106, 1.4211] 
Var = 79.5118 


1.3510 
[1.3039, 1.3981] 
Var = 57.7425 


1.3602 
[1.3472, 1.3732] 
Var = 4.4053 



Figure 9: Numerical results for the Up In Call option, with 20 stratas. 



Parameters 


Broadie & 
Glasserman's 
proxy 


Simple 
Estimator 


Strat. Estimator 
sub-optimal weights 


Strat. Estimator 
Lip. -optimal weights 


Strat. Estimator 
Optimal weights 


S = 100, K = 100 
H = 125, cr = 0.3, 
T = 1.5, n = 365 


13.9597 


14.0379 
[13.8705, 14.2053] 
Var = 729.2518 


13.9382 
[13.8720, 14.0043] 
Var = 114.0634 


13.9511 
[13.8874, 14.0150] 
Var = 105.8760 


13.9483 
[13.9047, 13.9919] 
Var = 49.5071 


S = 100, K = 100 
H = 200, cr = 0.3, 
T = 1, n = 365 


1.3665 


1.4206 
[1.3442, 1.4969] 
Var = 151.6366 


1.3296 
[1.2825, 1.3768] 
Var = 57.8899 


1.3493 
[1.3093, 1.3893] 
Var = 41.6666 


1.3611 
[1.3508, 1.3715] 
Var = 2.8099 



Figure 10: Numerical results for the Up In Call option, with 100 stratas. 



5.2 Test with an auto-call pricing in the CEV model. 

Here, we stand in the case were the stock follows a CEV model with no drift 

dSt^aSfdWt, 0</3<2. 
The simulation scheme that is used here is a Euler scheme on In(S't). One has 

dln{St) = -Y^r^dt + asf~^dWt. 

Let us remind the fact that there are closed-form expressions for vanilla option pricing in this model 
that can be expressed as a function of the noncentral chi-square distribution [13.j . A first test of consis- 
tency for the method was to check that we could find the same price when performing such a Monte-Carlo 
simulation. The tested path-dependent payoff that we consider here is the so-called "auto-call" payoff. 
Description of the auto-call payoff: 

St is the stock price at time t and ti < ■ ■ ■ < tn — T is a schedule of observation dates. K and H, 
the "strike" and the "barrier" are two settled values with which S will be compared to. P denotes the 
"nominal", and C a bound. 

At the first date ti of the schedule, if St^ > K, the holder of the option gets (1 + C)P and the product 
stops. If < K, one waits until the second date of the schedule. If St2 > K, the holder gets (1 + C)P 
and the product stops. And so on... If St does not reach K until the last date tn = T. 
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At t„, if St > K, the holder gets (1 + C)P. If B < St < K, the holder gets P and if St < B, he 
gets P^. 

The numerical results are reported in table [TT] when using the method with 20 and 50 stratas. The 
parameters of the model are /3 = 1.5, Sq = 100, a = 0.3. For the payoff, K = 110, H = 80, P ^ 100, 
C = 0.07. The considered observation dates are {1, 2, 3}. The number of time steps in the Euler scheme 
is 300 and one performs 100000 Monte-Carlo simulations in every case. 



Number of strata 


Simple 
Estimator 


Strat. Estimator 
sub-optimal weights 


Strat. Estimator 
Lip. -optimal weights 


Strat. Estimator 
Optimal weights 


20 


99.0598 
[98.9887,99.1310] 
Var = 131.8089 


99.0839 
[99.0438,99.1239] 
Var = 41.8067 


99.0886 
[99.0488,99.1284] 
Var = 41.2888 


99.0477 
[99.0184,99.0769] 
Var = 22.2549 


50 


99.0598 
[98.9887,99.1310] 
Var = 131.8089 


99.0507 
[99.0129,99.0886] 
Var = 37.3150 


99.0790 
[99.0414,99.1166] 
Var = 36.8408 


99.0444 
[99.0179,99.0709] 
Var = 18.2954 



Figure 11: Numerical results for the auto-call option in the CEV model, with 20 and 50 stratas. 



5.3 Test with an Asian option pricing in the one-factor Schwartz's model. 

Here, we stand in the case of a stock which follows the following SDE: 

dS = e{a - \nS)Sdt + aSdWt, (32) 

under the risk neutral probability. 

The stochastic process X — \n{S) is an Ornstein-Uhlenbeck process: 

2 

dX ^ 9{f^ - X)dt + adWt with ^l^a-^. (33) 

This model, proposed by Schwartz in [25] is an example of stochastic behaviour of commodity prices 
that takes into account mean reversion. Such exponentials of Ornstein-Uhlenbeck processes are very 
common in commodity derivatives models. One particularity in these markets is that the spot is not 
directly observed. Derivatives mostly rely on futures of the considered commodity. Still, one takes this 
one factor "toy" model as a simple case study for our variance reduction method. 

The considered payoff is an Asian option on a discrete schedule of observation dates Iq < ■ ■ ■ < tn = T . 

K is the "strike" of the options whose payoff is I ^ St^ — K 

\ fe=0 

One uses the stratified estimator with the Ornstein-Uhlenbeck process. Optimal product decomposi- 
tions for the criteria (|30p are used and available in table [12] where the numerical results are reported. 

The numerical parameters are 5*0 = 100, 6 = 0.3, a = In(llO), a = 0.3 and K = 100. One performs 
100000 Monte-Carlo simulations in every case. The observation dates are («^) j_|Q „} with T = 3 and 
n — 36. 



Number of strata 
and product decomposition 


Simple 
Estimator 


Strat. Estimator 
sub-optimal weights 


Strat. Estimator 
Lip. -optimal weights 


Strat. Estimator 
Optimal weights 


20 

20 = 10 X 2 


9.8485 
[9.7508,9.9462] 
Var = 248.3156 


9.8867 
[9.8632,9.9102] 
Var = 14.3132 


9.8848 
[9.8624,9.9073] 
Var = 13.1090 


9.8846 
[9.8695,9.8997] 
Var = 5.9547 


50 

48 = 10 X 5 


9.8485 
[9.7508,9.9462] 
Var = 248.3156 


9.8835 
[9.8608,9.9061] 
Var = 13.4003 


9.87862 
[9.8555,9.8983] 
Var = 11.8787 


9.8845 
[9.8702,9.8987] 
Var = 5.2949 


100 

100 = 10 X 5 X 2 


9.8485 
[9.7508,9.9462] 
Var = 248.3156 


9.8883 
[9.8661,9.9105] 
Var = 12.8434 


9.8924 
[9.8716,9.9133] 
Var = 11.3508 


9.8844 
[9.8706,9.8782] 
Var = 4.9664 



Figure 12: Numerical results for the Asian option in the Schwartz's model, with 20 and 100 stratas. 

To perform this computation, one had to use a non-centered Ornstein-Uhlenbeck quantizer. Building 
such a quantizer is a straightforward extension of the centered case. As showed in section [B] if r is an 
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Ornstein-Uhlenbeck process on [0,T] following the dynamic drt = 6{ii — rt)dt + adWt, ro ^ Af{mo,al), 
with non zero values of fi and mo, one has 

_af _af-, ( centered Ornstein-Uhlenbeck process \ 
\^ > \^ correspondnig to mo = ^ = j 

(1)— non stochastic path 

Hence, one only needs to add the expectation (1) to the centered optimal (product) quantizer to 
get an optimal (product) quantizer for the non centered case. An example of such a non centered 
Ornstein-Uhlenbeck product quantizer is available in figure 1131 

6 

5.8 
5.6 
5.4 
5.2 
5 
4.8 
4.6 

0.5 1 1.5 2 2.5 3 

Figure 13: Functional 10 x 2-product quantizer of an Ornstein-Uhlenbeck process starting from ro = 6 
defined by the diffusion drt = 0(/i — rt)dt -|- <jdWt with n = 5, a = 0.3 and 9 = 0.8 on [0, 3]. 




5.4 Commentaries on tiie numerical results 

In every tested case, one notices that the quantization-based stratified sampling method reduces notice- 
ably the variance of the Monte-Carlo estimator. The "universal stratification" allocation proposed in 
theorem 13.21 overcomes the sub-optimal weight allocation. Still in the case of the auto-call, its advantage 
is not very perceptible. 

Moreover, the "optimal allocation" estimation yields a very good variance reduction factor. This 
suggests to implement either a simple prior rough estimation of the optimal allocation or a more sophis- 
ticated algorithm as the one proposed in ^ by Etore and Jourdain. 



A Special case of the Brownian motion for Ry\v computation. 

In this section, one uses the same notations as in section 14.21 We give the closed form of the matrix 
Ry\v iictij))i<i<d o<j<n G Md ni^) which corrcspouds to the affine function AfY\v defined by 

E[Y\V] = AfYwivy ^ " " 

Consider to = < ti < ■ ■ ■ < tn ~ T a, subdivision of [0, T]. 



E 



Wsef'{s)ds 



where /* is an affine function. 
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If t, ^ y) = e[/,*'+^ [x + ^^(2/ - x) + (where vf:^^-'^ 

standard Brownian bridge) 



IS a 



s - t 



1 „W I 



Simple computations lead to: 

ft j+i 



and 



li(rT)("'<'"4'|)-™W'4'¥))- 



2 T 



cos(7r(i — |)^) — tj+i cos(7r(i — 5)^^)^ 



l(^) (sin(7^(^-i)^)-sinW^-i)^)) 



Hence E 



_ -1 n—l n 

Jo Wse^is)ds\Wt, ,Wt^ = E ^W/'t. +B'jWt^+, = E "^j ^-4. with, for every 1 < j < 



Finally one gets the following closed forms for Ry\v •= ((Q^ij))i<i<d.o<j<n- 
• If tj—i < tj < ij+i, 



ttij = A, 



[tj+i - tj){tj - tj^i) 



If < t, = t,+i, a,, = (iTlMzl^ _ '(t,)) 



If t. 



in the other case. 
[ in the other case. 

(The equality cases are useful when dealing with time steps that make the numerical evaluation of 
ej^(tj+i) — ef^ (tj) to close to zero.) 



B Computation of the Karhunen-Loeve decomposition the Ornstein- 
Uhlenbeck process 

In this section, one details the Karhunen-Loeve decomposition of the Ornstein-Uhlenbeck process. Propo- 
sition IB. 31 brings the results together. Section IB. 31 presents the numerical method for computing this 
decomposition. 
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B.l The Ornstein-Uhlenbeck process 

The Ornstein-Uhlenbeck process is defined by the SDE 

drt=e{p-rt)dt + adWt, with cr > and 61 > 0. (35) 
The equation is solved by applying Ito's formula to the process Ut := rte^* . One gets 



rt = roe 



M(l-e-«*)+ / ae^'^'-'UWs. (36) 



If one assumes that vq is Gaussian (rp ^ A/'(mo, CTq)) and is independent from W, the process (rt)t>o is 
Gaussian. One has E[rt] = TOoe"^* + /^(l-e-^*) and cov(r^, r*) = §^e-®(^+*' (e^^™'"^'''*) - 1) + crge-^(''+*l 
Moreover lim Var(rt) — 2- (the long term variance). If the initial variance ctq is equal to long term 

variance the process is stationary and the covariance writes cov(rs,rt) = f^e"^'^"*'- 
The total variance of the process on [0, T] is 

/■^ (T^T / \ / 1 p-2eT 

v„,.-.,* = ^ + (.>-f^)(l-V). 

B.2 The Ornstein-Uhlenbeck covariance operator: 

The Ornstein-Uhlenbeck covariance operator is given by 

tO^/(s) = r ^e-^(^+*) f e2«--(-'*) - l) f{s)ds + C ale-'^'+'^ f{s)ds. (37) 

Computing the Karhunen-Loeve decomposition of the Ornstein-Uhlenbeck process 

T'-'^ is a compact Hermitian positive operator on the separable Hilbert space L^{[0, T]). Hence there 
is an orthonormal basis of V consisting of eigenvectors of T'-^^ and each eigenvalue is real and strictly 

positive. Moreover ||T°^f < ^ + ^(e'^^^ - 1^. One has 

^ouf,.. _ r ^ji^-t) fu\^. ^ r ^ji^-") fu\^. ^ r f 2 _ ^\ -e{s+t) ^ 



Proposition B.l. If f e C([0, 1]), and if g = T'^^ f , then 

g" - e^g = -o^f, (38) 



with 



Proof: 



alg'{Q) = (a^ -eal)g{G) and g' (T) = -0g{T). (39) 
ait) = ^e'''-''f{s)ds + ^e'^^^-^^f{s)ds + [ol e-^(=+*)/(.)ds. 



One gets g"{t) = 6^g{t) — a^f{t). Moreover, equation p9[l comes when identifying expressions with t — and 
t^T. □ 

Proposition B.2. Conversely, if g G C^([0,T]) and if f satisfies equations i38]) and i39l) then g = 

rpOU J 
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Proof: Computing T^'^g" yields: 
An integration by parts yields 



= -^oV(0)e-''* - a^git) + ^g{Q)e-0' - [Oal - ^jg(0)e-«* + O^T^^ g{t) 
= -cr^g{t) + e^T^^g{t) thanks to ([5^]) . 

Now, by necessary conditions, T'~'^ f — \f 4^ a^g = \{0^g — g"). One obtains 

A.g" + (0-2 _ Ae'2)g = 0. (40) 

□ 

Hence the solution of the ordinary differential equation PU)) on [0, T] has the form g{t) — Acos(ajt) + 
Ssin(wt), with w = y^^^^^ A = ^3^^. 

Equation yields uiBaQ = (<t^ — 0CTQ)yl. Hence, function writes 

= if (^wo-Q cos(cjt) + (ct^ - 0(To) sm{ujt)j . 

Hence g'(T) = -6'.g(T) yields 

wcr^ cos(a;T) + [ - w^cr^ + e'cr^ - 6*^0] sin(wr) = 0. (41) 

Conversely, by the same computation, one sees that A„ g]0, ||T'^'^||2] is an eig envalue of T^^ if and 
only if equation (HT|) is fulfilled. 



Proposition B.3. Finally, if one knows the sorted increasing sequence (w„) of the strictly positive 
solutions of equation |^i[ ), the Karhunen-Loeve basis {X^^,e'^^) of the Ornstein- Uhlenbeck covariance 
operator are given by: 

• e^^(i) = -R'n^tdnCTo cos(a;„i) + (cr^ — ^CTq) sin(a;„t)^ , where Kn is the normalization constant. 



If 7^ (0,0), Kn is given by 

^/Kn = ^^o(<^' - - cos(2w„T)] + ia^u;^ (t + ^ sin(2^„r)) 



+ ^('^'-^^'^o)'(7^-^sin(2w„T)). (42) 



Case of a deterministic start point: In this case (ctq — 0), one has 

It _ sin(2a;„T) 
9 2 

Stationary case: In the stationary case, o-q — one has 

e°^(0 = Cn(ujnC08{ujnt) + 6* sin(cLi„t)^ , 
where C„ is the normalization constant. C„ is given by 

cj2/ sin(2cj„T)\ , 02 sin(2L^„T) 



B.3 Numerical computation of the Karhunen-Loeve decomposition of the 
Ornstein-Uhlenbeck process 

As we have seen in the previous section, everything comes to evaluate numerically the strictly positive 
solutions of equation (|^T1) . 
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B.3.1 Deterministic start point 



In this case, (cto = 0), one can check that elements of {■! 
thus the equation comes to 

et0.ii{LoT) = 



-k^\k G N} are not solutions of equation (IHT) . 
UJ. (43) 



The case where 6 — comes to the case of the Brownian motion, hence one assumes that 9^0. 
Solutions of this equation are illustrated in figure [HI One can easily show that a unique solution Wn lies 



in each interval 



T 



7r nTT 
2T' T 



, for n e N* 



/mr 

iim Wn — — 




B.3.2 Non deterministic start point 

Let us assume now that ctq 7^ and consider equation (|^T1) again. The term — cj^ctq + 9a^ — 9^ 
vanishes on ]0, +00 [ if 9'^(Jq — 9a^ > 0. 
First case: 9'^an — 9 





.2^2 fl„2 > Q_ 



Here, everything comes to the equation 

tan(a;T) 



,2^2 



9a^' 



(44) 



Solutions of this equation are illustrated in figure 1151 
One can easily show that Vn e N* , 3luj E 



a solution lies in ]0, ^[ if and only if {9^(Ti 



9a^)T-a^ 



, which is solution of equation (UJ). Moreover 
< 0. 



Second case: 9 an 



9a^ < 0. 

9(j^ — 9^(Tk vanishes for w = V := 



— 6*2 . If y is not a solution of 



Here, the term — cj Oq uu — u uq 

equation (|¥T|l . (i.e. if V does not belong to + k^\k E N}), no other value of this set is a solution, 
and everything comes again to the same equation 

Solutions of this equation are illustrated in figurelTCl One can then easily show that Vn E N*n]y, +00 [, 3luj E 
which is solution of equation (j4ip . Moreover, in every non empty interval 



T ' T ' 2T 



T 



2T 



T 



2T 



k E N* there is another solution of the equation. 
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B.3.3 Pseudo-Code for computing Ornstein-Uhlenbeck eigenvalues 

A pseudo-code for the computation of the n-th eigenvalue of the Ornstein-Uhlenbeck covariance operator 
is given in algorithm [T] In this algorithm, the function search(a, left, right) stands for a root finding 
method. It fills argument a with the root of equation (PT|) that is bracketed by [left, right]. 

In the author's implementation, one uses the Brent's method [Sj as a reliable root finding method. As 
Newton-likc methods. Brent method can take advantage of a guess of the value of the root. (One needs 
a bracketing method: the idea is to start from a small interval around the guess that is geometrically 
expanded, until the limiting range [left, right] is reached. ) 

B.3.4 A numerical guess for the value of a;„. 

As we have seen, we use a root finding method for evaluating numerically the value of cj„. In this section, 
we propose a numerical guess for this quantity, that can be used as a starting point in the root finding 
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Algorithm 1 Ornstein-Uhlenbeck Eigenvalue {6, a, ao,T,n) 
Require: 6* > 0, ct > 0, ctq > 0, T > 0, n > 1. 

if (To = then 

{There is a unique solution Wn of 14111 in the interval ] ^^ — ^ , [. } 

se&rch^uj^ , 2^ 5 ^ ' 
else 

{Here ctq > 0.} 

if (eVg - 61(72) > then 

{Xlie vertical asymptote of the right hand of equation 1441 lies on the left of 0. } 

if (6i2(Tg - 9a^)T - (72 < then 

{There is a unique solution Wn of I I41I1 in the interval ]0, } 

else 

{The smallest stricly positive solution t«i of equation 1411 lies in the interval ^I-} 

search(w„,2H:,2^ + #)■ 
end if 
else 

{The vertical asymptote of the right hand of equation 1441 lies on the right of 0. } 

if i^^^ - # > ^Jo^^ then 

search(w„, j^^"" , + 
else if - # < y^6lg - 02 then 

else if ^ - ^ < ^Je^^ and ^^^^ ^ # > ^^fT"^ then 
search(u;„,ft - ^^^e^^-O"^). 



else 




end if 



end if 
end if 

Return Ar,. 
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method. 

Function tan is approximated on ] — ^, ^[ by the rational fraction ip{x) :— 4-,2 ' — , which is a 

good uniform approximation of tan on this interval. || tan— i/^Hoo ^ ' ^ = ^^2t^ ~ 0.02075. 

Now, in the case of the Ornstein-Uhlenbeck eigenvalue computation, for a deterministic start point, 
equation (j43p can be approximated by 

9^{LJnT + m:) = — w„ n>l. (45) 

This comes to a polynomial equation of degree 3 for every n > which has a unique solution w^"^** £ 
] 2h: _ ^ 2HL This numerical guess yields a good accuracy for approximating the value of w„ . 
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